AITopics | esr criterion

esr criterion

Deep Multi-Objective Reinforcement Learning for Utility-Based Infrastructural Maintenance Optimization

van Remmerden, Jesse, Kenter, Maurice, Roijers, Diederik M., Andriotis, Charalampos, Zhang, Yingqian, Bukhsh, Zaharah

arXiv.org Artificial Intelligence, Jun-10-2024

In this paper, we introduce Multi-Objective Deep Centralized Multi-Agent Actor-Critic (MO-DCMAC), a multi-objective reinforcement learning (MORL) method for infrastructural maintenance optimization, an area traditionally dominated by single-objective reinforcement learning (RL) approaches. Previous single-objective RL methods combine multiple objectives, such as probability of collapse and cost, into a single reward signal through reward-shaping. In contrast, MO-DCMAC can optimize a policy for multiple objectives directly, even when the utility function is non-linear. We evaluated MO-DCMAC using two utility functions, which use probability of collapse and cost as input. The first utility function is the Threshold utility, in which MO-DCMAC should minimize cost so that the probability of collapse is never above the threshold. The second is based on the Failure Mode, Effects, and Criticality Analysis (FMECA) methodology used by asset managers to assess maintenance plans. We evaluated MO-DCMAC, with both utility functions, in multiple maintenance environments, including ones based on a case study of the historical quay walls of Amsterdam. The performance of MO-DCMAC was compared against multiple rule-based policies based on heuristics currently used for constructing maintenance plans. Our results demonstrate that MO-DCMAC outperforms traditional rule-based policies across various environments and utility functions.
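To make the Threshold utility described in the abstract more concrete, here is a minimal Python sketch. It is an illustration only, not the formulation from the paper: the function name `threshold_utility`, the default threshold of 0.05, the fixed `penalty`, and the three example plans are all hypothetical values chosen for this sketch.

```python
def threshold_utility(cost: float, p_collapse: float,
                      threshold: float = 0.05, penalty: float = 1e6) -> float:
    """Hypothetical threshold utility (higher is better).

    Plans whose probability of collapse exceeds the threshold receive a
    large fixed penalty; among the remaining plans, lower cost is better.
    """
    if p_collapse > threshold:
        return -penalty - cost
    return -cost


# Hypothetical maintenance plans: name -> (cost, probability of collapse).
plans = {
    "do_nothing": (0.0, 0.20),
    "repair": (40.0, 0.03),
    "replace": (120.0, 0.01),
}

# Pick the plan with the highest utility under the assumed threshold.
best = max(plans, key=lambda name: threshold_utility(*plans[name]))
print(best)
```

Because the utility jumps at the threshold, it is non-linear in its inputs, which is the situation the abstract says a weighted-sum reward cannot capture directly.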

objective, probability, utility function, (16 more...)

arXiv.org Artificial Intelligence

2406.06184

Country:

Europe > Netherlands > North Holland > Amsterdam (0.25)
Europe > Netherlands > South Holland > Delft (0.04)
Europe > Netherlands > North Brabant > Eindhoven (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Banking & Finance (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Monte Carlo Tree Search Algorithms for Risk-Aware and Multi-Objective Reinforcement Learning

Hayes, Conor F., Reymond, Mathieu, Roijers, Diederik M., Howley, Enda, Mannion, Patrick

arXiv.org Artificial Intelligence, Dec-6-2022

In many risk-aware and multi-objective reinforcement learning settings, the utility of the user is derived from a single execution of a policy. In these settings, making decisions based on the average future returns is not suitable. For example, in a medical setting a patient may only have one opportunity to treat their illness. Making decisions using just the expected future returns -- known in reinforcement learning as the value -- cannot account for the potential range of adverse or positive outcomes a decision may have. Therefore, we should use the distribution over expected future returns differently to represent the critical information that the agent requires at decision time by taking both the future and accrued returns into consideration. In this paper, we propose two novel Monte Carlo tree search algorithms. Firstly, we present a Monte Carlo tree search algorithm that can compute policies for nonlinear utility functions (NLU-MCTS) by optimising the utility of the different possible returns attainable from individual policy executions, resulting in good policies for both risk-aware and multi-objective settings. Secondly, we propose a distributional Monte Carlo tree search algorithm (DMCTS) which extends NLU-MCTS. DMCTS computes an approximate posterior distribution over the utility of the returns, and utilises Thompson sampling during planning to compute policies in risk-aware and multi-objective settings. Both algorithms outperform the state-of-the-art in multi-objective reinforcement learning for the expected utility of the returns.
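The gap between deciding on the expected return and deciding on the expected utility of the returns can be shown with a small Python sketch. The utility function and the two return samples below are hypothetical, chosen only to make the difference visible; this is not NLU-MCTS or DMCTS.

```python
from statistics import mean


def utility(ret: float) -> float:
    # Hypothetical nonlinear, risk-averse utility: losses count ten times.
    return ret if ret >= 0 else 10 * ret


def utility_of_expected_return(returns: list[float]) -> float:
    # Apply the utility to the average return (a value-based criterion).
    return utility(mean(returns))


def expected_utility_of_returns(returns: list[float]) -> float:
    # Average the utility over individual policy executions.
    return mean([utility(r) for r in returns])


# Hypothetical returns observed from single executions of two policies.
safe = [1.0, 1.0, 1.0, 1.0]
risky = [8.0, 8.0, -5.0, -5.0]

print(utility_of_expected_return(safe), utility_of_expected_return(risky))    # 1.0 1.5
print(expected_utility_of_returns(safe), expected_utility_of_returns(risky))  # 1.0 -21.0
```

Under the averaged-return criterion the risky policy looks better, while the expected utility of the returns prefers the safe one, which matches the abstract's point that the value alone cannot account for the range of outcomes of a single execution.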

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2211.13032

Country:

North America > United States > Virginia > Arlington County > Arlington (0.04)
Asia > Singapore (0.04)
North America > United States > Oklahoma > Payne County > Cushing (0.04)
(8 more...)

Genre: Research Report (0.64)

Industry:

Energy (1.00)
Leisure & Entertainment > Games (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Expected Scalarised Returns Dominance: A New Solution Concept for Multi-Objective Decision Making

Hayes, Conor F., Verstraeten, Timothy, Roijers, Diederik M., Howley, Enda, Mannion, Patrick

arXiv.org Artificial Intelligence, Jun-2-2021

In many real-world scenarios, the utility of a user is derived from the single execution of a policy. In this case, to apply multi-objective reinforcement learning, the expected utility of the returns must be optimised. Various scenarios exist where a user's preferences over objectives (also known as the utility function) are unknown or difficult to specify. In such scenarios, a set of optimal policies must be learned. However, settings where the expected utility must be maximised have been largely overlooked by the multi-objective reinforcement learning community and, as a consequence, a set of optimal solutions has yet to be defined. In this paper we address this challenge by proposing first-order stochastic dominance as a criterion to build solution sets to maximise expected utility. We also propose a new dominance criterion, known as expected scalarised returns (ESR) dominance, that extends first-order stochastic dominance to allow a set of optimal policies to be learned in practice. We then define a new solution concept called the ESR set, which is a set of policies that are ESR dominant. Finally, we define a new multi-objective distributional tabular reinforcement learning (MOT-DRL) algorithm to learn the ESR set in a multi-objective multi-armed bandit setting.
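As a rough illustration of first-order stochastic dominance, the Python sketch below compares empirical return samples through their cumulative distribution functions. It is a single-objective simplification under assumed data: the paper works with multi-objective return distributions, and the helper names `cdf` and `fsd_dominates` are hypothetical, not taken from the MOT-DRL algorithm.

```python
def cdf(samples: list[float], v: float) -> float:
    # Empirical probability that a sampled return is at most v.
    return sum(1 for s in samples if s <= v) / len(samples)


def fsd_dominates(x: list[float], y: list[float]) -> bool:
    """True if X first-order stochastically dominates Y (univariate case).

    X dominates Y when its CDF is never above Y's and is strictly
    below it for at least one return value.
    """
    points = sorted(set(x) | set(y))
    no_worse = all(cdf(x, v) <= cdf(y, v) for v in points)
    strictly_better = any(cdf(x, v) < cdf(y, v) for v in points)
    return no_worse and strictly_better


# Hypothetical return samples for three policies.
a = [2.0, 3.0, 4.0]
b = [1.0, 2.0, 3.0]
c = [0.0, 10.0]

print(fsd_dominates(a, b))  # True
print(fsd_dominates(b, a))  # False
print(fsd_dominates(a, c), fsd_dominates(c, a))  # False False
```

Policies `a` and `c` do not dominate each other, so a solution set built this way would retain both, which is why the abstract speaks of learning a set of optimal policies when the utility function is unknown.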

esr criterion, utility function, value distribution, (15 more...)

arXiv.org Artificial Intelligence

2106.01048

Country:

Europe > Belgium (0.04)
Europe > Netherlands (0.04)
North America > United States > New York (0.04)
(8 more...)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.48)
Health & Medicine > Therapeutic Area > Vaccines (0.30)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
